Add gender and nationality columns to admission data pipeline #2

Merged
YichengYang-Ethan merged 1 commit into main from claude/add-gender-nationality-uGJQp
Mar 15, 2026
@YichengYang-Ethan

  • CSV schema: 15 → 17 columns (added gender, nationality)
  • AdmissionRecord: added gender (M/F), nationality (raw), nationality_canonical (domestic/china/hk_tw/other_intl)
  • classify_nationality(): fuzzy-matches Chinese/English nationality strings to canonical values (美籍 "US citizen" → domestic, 中国大陆 "mainland China" → china, etc.)
  • ProgramStats: added female_rate_accepted, nationality_dist_accepted
  • Feature importance: added gender_f and domestic effect sizes
  • Calibrator predict_outcome: gender diversity bonus (7%) and nationality/domestic advantage (8%) integrated into scoring
  • CLI stats: shows demographics line (gender + nationality breakdown)
  • Sample data: all 30 records annotated with gender + nationality
  • 236 tests passing, ruff clean
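
For reviewers who want a feel for the nationality mapping, here is a rough sketch of how a classifier like classify_nationality() could work. It is not the code from this PR: the keyword table is hypothetical, and it uses plain case-insensitive substring matching as a stand-in for whatever fuzzy matching the real function does. Only the four canonical values and the two example mappings (美籍 → domestic, 中国大陆 → china) come from the description above. Returning None for missing input is also an assumption.

```python
from typing import Optional

# Hypothetical keyword table (assumption): the real mapping in the PR is not shown.
# Order matters: hk_tw is checked before china so that strings such as
# "中国香港" or "Hong Kong, China" are not swallowed by the broader china keywords.
_KEYWORDS = {
    "domestic": ("美籍", "美国", "us citizen", "american", "united states"),
    "hk_tw": ("香港", "台湾", "hong kong", "taiwan"),
    "china": ("中国大陆", "中国", "china", "chinese"),
}


def classify_nationality(raw: Optional[str]) -> Optional[str]:
    """Map a free-text nationality string to a canonical bucket.

    Returns None for missing or blank input (assumption), otherwise one of
    "domestic", "hk_tw", "china", or "other_intl".
    """
    if raw is None:
        return None
    text = raw.strip().lower()
    if not text:
        return None
    # Dicts preserve insertion order, so buckets are tried top to bottom.
    for canonical, keywords in _KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return canonical
    # Any non-empty value that matches no keyword falls into the catch-all bucket.
    return "other_intl"
```

Substring matching is easy to audit but can misfire on strings that merely contain a keyword, so the real implementation may need stricter rules.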

https://claude.ai/code/session_014dkZ9Eq3DPVaUfRTeN2HXp
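
The two new ProgramStats fields could be computed along these lines. This is a sketch under stated assumptions, not the PR's implementation: the `result` field, the "accepted" value, and the helper name `demographics_of_accepted` are hypothetical. Only `gender` (M/F), `nationality_canonical`, `female_rate_accepted`, and `nationality_dist_accepted` are named in the description. Records with unknown gender are left out of the rate's denominator, which is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class AdmissionRecord:
    # Only the fields needed for this sketch; `result` is a hypothetical name.
    result: str
    gender: Optional[str] = None                 # "M" / "F" per the PR description
    nationality_canonical: Optional[str] = None  # domestic / china / hk_tw / other_intl


def demographics_of_accepted(
    records: List[AdmissionRecord],
) -> Tuple[Optional[float], Dict[str, int]]:
    """Return (female_rate_accepted, nationality_dist_accepted) for accepted records."""
    accepted = [r for r in records if r.result == "accepted"]

    # Ignore records without a known gender so they do not dilute the rate.
    with_gender = [r for r in accepted if r.gender in ("M", "F")]
    female_rate = (
        sum(r.gender == "F" for r in with_gender) / len(with_gender)
        if with_gender
        else None
    )

    # Count canonical nationalities, skipping records where it is missing.
    dist = Counter(
        r.nationality_canonical for r in accepted if r.nationality_canonical
    )
    return female_rate, dict(dist)
```

Returning None rather than 0.0 when no accepted record has a known gender keeps "no data" distinguishable from "no women admitted".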

@YichengYang-Ethan YichengYang-Ethan merged commit 87a750e into main Mar 15, 2026
3 checks passed

2 participants